Generic-case Complexity
Generic-case complexity is a subfield of computational complexity theory that studies the complexity of computational problems on "most inputs". It measures the complexity of a computational problem by neglecting a small set of unrepresentative inputs and considering worst-case complexity on the rest; "small" is defined in terms of asymptotic density. The apparent efficacy of generic-case complexity stems from the fact that, for a wide variety of concrete computational problems, the most difficult instances seem to be rare; typical instances are relatively easy.

This approach to complexity originated in combinatorial group theory, which has a computational tradition going back to the beginning of the last century. The notion of generic complexity was introduced in a 2003 paper (I. Kapovich, A. Myasnikov, P. Schupp and V. Shpilrain, ''Generic-case complexity, decision problems in group theory and random walks'', J. Algebra 264 (2003), 665–694), where the authors showed that for a large class of finitely generated groups the generic time complexity of some classical decision problems from combinatorial group theory, namely the word problem, the conjugacy problem and the membership problem, is linear. A detailed introduction to generic-case complexity can be found in the survey by R. Gilman, A. G. Miasnikov, A. D. Myasnikov and A. Ushakov, ''Generic Case Complexity'' (unpublished first draft of a book, 143 pages).


Basic definitions


Asymptotic density

Let ''I'' be an infinite set of inputs for a computational problem.

Definition 1. A size function on ''I'' is a map \sigma : I \to \mathbb{N} with infinite range. The ball of radius ''n'' is B_n = \{x \in I : \sigma(x) \le n\}. If the inputs are coded as strings over a finite alphabet, size might be the string length.

Let \{\mu_n\} be an ensemble of probability distributions where \mu_n is a probability distribution on B_n. If the balls B_n are finite, then each \mu_n can be taken to be the equiprobable distribution, which is the most common case. Notice that only finitely many B_n's can be empty or have \mu_n(B_n) = 0; we ignore them.

Definition 2. The asymptotic density of a subset X \subseteq I is \rho(X) = \lim_{n \to \infty} \mu_n(X \cap B_n) when this limit exists. When the balls B_n are finite and \mu_n is the equiprobable measure, \rho(X) = \lim_{n \to \infty} \frac{|X \cap B_n|}{|B_n|}. In this case it is often convenient to use spheres I_n = \{x \in I : \sigma(x) = n\} instead of balls and define \rho'(X) = \lim_{n \to \infty} \frac{|X \cap I_n|}{|I_n|}. An argument using the Stolz–Cesàro theorem shows that \rho(X) exists if \rho'(X) does, and in that case they are equal.

Definition 3. X \subseteq I is generic if \rho(X) = 1 and negligible if \rho(X) = 0. ''X'' is exponentially (superpolynomially) generic if the convergence to the limit in Definition 2 is exponentially (superpolynomially) fast, etc.

A generic subset ''X'' is asymptotically large. Whether ''X'' appears large in practice depends on how fast \mu_n(X \cap B_n) converges to 1. Superpolynomial convergence seems to be fast enough.


Generic complexity classes

Definition 4. An algorithm is in ''GenP'' (generically polynomial time) if it never gives incorrect answers and if it gives correct answers in polynomial time on a generic set of inputs. A problem is in ''GenP'' if it admits an algorithm in ''GenP''. Likewise for ''GenL'' (generically linear time), ''GenE'' (generically exponential time with a linear exponent), ''GenExp'' (generically exponential time), etc. ''ExpGenP'' is the subclass of ''GenP'' for which the relevant generic set is exponentially generic. More generally, for any f : \mathbb{N} \to \mathbb{N} we can define the class ''Gen(f)'' corresponding to time complexity ''O''(''f'') on a generic set of inputs.

Definition 5. An algorithm solves a problem generically if it never gives incorrect answers and if it gives correct answers on a generic set of inputs. A problem is generically solvable if it is solved generically by some algorithm.
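As a toy illustration of Definition 5 (the problem, names and prefix-length choice are illustrative), consider deciding whether a binary word contains the substring "00". The partial algorithm below inspects only an O(log n) prefix, answers only when "00" occurs there, and gives no answer otherwise, so it is never incorrect. The words in which "00" appears within the first 4·log₂(n) symbols form a generic set (the exceptional words have density decaying polynomially in ''n''), so the algorithm solves the problem generically, in sublinear time:

```python
import math

def contains_00(w):
    """Complete linear-time decision procedure, for comparison."""
    return "00" in w

def generic_contains_00(w):
    """Partial algorithm: looks only at an O(log n) prefix.
    Returns True when it is sure, None (no answer) otherwise,
    so it never gives an incorrect answer."""
    if not w:
        return None
    k = 4 * max(1, math.ceil(math.log2(len(w) + 1)))
    return True if "00" in w[:k] else None
```

On inputs where it answers, the answer agrees with the complete procedure; on the rest, a set of density tending to 0, it simply stays silent, which is exactly what Definition 5 permits.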


Theory and applications


Combinatorial group theory problems

* The famous undecidable problems: under suitable hypotheses, the word, conjugacy and membership decision problems are generically polynomial.
* The word and conjugacy search problems are in ''GenP'' for all fixed finitely presented groups.
* The well-known coset enumeration procedure admits a computable upper bound on a generic set of inputs.
* The Whitehead algorithm for testing whether or not one element of a free group is mapped to another by an automorphism has an exponential worst-case upper bound but runs well in practice. The algorithm is shown to be in ''GenL''.
* The conjugacy problem in HNN extensions can be unsolvable even for free groups. However, it is generically cubic time.
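The classical baseline behind such linear-time results is the word problem in a free group, which is solvable in worst-case linear time by free reduction: a word represents the identity element exactly when it freely reduces to the empty word. A minimal sketch (the integer encoding of generators is an illustrative convention):

```python
def free_reduce(word):
    """Freely reduce a word in a free group.
    Generators are encoded as nonzero integers; -x denotes the inverse of x.
    A single stack pass cancels adjacent inverse pairs, so this runs in O(n)."""
    stack = []
    for x in word:
        if stack and stack[-1] == -x:
            stack.pop()          # cancel the adjacent pair x x^{-1} or x^{-1} x
        else:
            stack.append(x)
    return stack

# a b b^{-1} a^{-1} represents the identity:
print(free_reduce([1, 2, -2, -1]))   # []
```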


The halting problem and the Post correspondence problem

* The halting problem for Turing machines with one-sided tape is easily decidable most of the time; it is in ''GenP''. The situation for two-sided tape is unknown. However, there is a kind of lower bound for machines of both types: the halting problem is not in ''ExpGenP'' for any model of Turing machine.
* The Post correspondence problem is in ''ExpGenP''.
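A partial algorithm in the spirit of the one-sided-tape result can be sketched as follows: simulate the machine for a bounded number of steps and answer "halts" only when a halt is actually observed, so no answer is ever incorrect. (The encoding, the halting convention and the step budget below are illustrative; that this strategy succeeds on a generic set of machines is what the cited result establishes.)

```python
def observes_halt(delta, start_state, budget):
    """Simulate a Turing machine with a one-sided (right-infinite) tape,
    initially blank (symbol 0), for at most `budget` steps.
    delta maps (state, symbol) -> (write, move, next_state), move in {-1, +1}.
    The machine halts if no rule applies or the head falls off the left end.
    Returns True if a halt is observed, None (no answer) otherwise."""
    tape, pos, state = {}, 0, start_state
    for _ in range(budget):
        key = (state, tape.get(pos, 0))
        if key not in delta:
            return True            # no applicable rule: the machine halts
        write, move, state = delta[key]
        tape[pos] = write
        pos += move
        if pos < 0:
            return True            # head fell off the left end: halts
    return None                    # budget exhausted: give no answer

# A machine that immediately steps left halts; one that runs right forever gets no answer.
print(observes_halt({(0, 0): (0, -1, 0)}, 0, 100))                       # True
print(observes_halt({(0, 0): (1, +1, 0), (0, 1): (1, +1, 0)}, 0, 100))   # None
```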


Presburger arithmetic

The decision problem for Presburger arithmetic admits a double exponential worst-case lower bound and a triple exponential worst-case upper bound. The generic complexity is not known, but it is known that the problem is not in ''ExpGenP''.


NP-complete problems

As it is well known that NP-complete problems can be easy on average, it is not surprising that several of them are generically easy too.
* The three-satisfiability problem (3-SAT) is in ''ExpGenP''.
* The subset sum problem is in ''GenP''.
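The published ''GenP'' algorithm for subset sum is not reproduced here, but the general shape of a generically fast complete algorithm is easy to sketch: cheap tests decide a large fraction of instances outright, and an exhaustive search handles the rest. (An illustrative toy for nonnegative-integer instances, not the algorithm from the literature.)

```python
from itertools import combinations

def subset_sum(xs, t):
    """Complete decision procedure for subset sum over nonnegative integers.
    Fast path: O(n) range tests that settle many instances immediately.
    Slow path: brute-force search over all subsets (exponential fallback)."""
    total = sum(xs)
    if t == 0:
        return True                 # the empty subset works
    if t < 0 or t > total:
        return False                # target outside the reachable range
    if t == total:
        return True                 # take every element
    # Slow path: exponential, but reached only on the remaining instances.
    for r in range(1, len(xs)):
        for combo in combinations(xs, r):
            if sum(combo) == t:
                return True
    return False
```

When targets are drawn from a range much larger than sum(xs), the O(n) range test alone already decides most instances; a complete algorithm whose slow path is confined to a negligible set of inputs is the flavor of membership in ''GenP''.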


One-way functions

There is a generic-complexity version of a one-way function which yields the same class of functions but allows one to consider different security assumptions than usual.


Public-key cryptography

A series of articles is devoted to cryptanalysis of the Anshel–Anshel–Goldfeld key exchange protocol, whose security is based on assumptions about the braid group. This series culminates in Miasnikov and Ushakov (2008), which applies techniques from generic-case complexity to obtain a complete analysis of the length-based attack and the conditions under which it works. The generic point of view also suggests a new kind of attack, called the quotient attack, and a more secure version of the Anshel–Anshel–Goldfeld protocol.


List of general theoretical results

* The famous Rice's theorem states that if ''F'' is a subset of the set of partial computable functions from \mathbb{N} to \mathbb{N}, then unless ''F'' or its complement is empty, the problem of deciding whether or not a particular Turing machine computes a function in ''F'' is undecidable. The following theorem gives a generic version. Theorem 1. Let ''I'' be the set of all Turing machines. If ''F'' is a subset of the set of all partial computable functions from \mathbb{N} to itself such that ''F'' and its complement are both non-empty, then the problem of deciding whether or not a given Turing machine computes a function from ''F'' is not decidable on any exponentially generic subset of ''I''.
* The following theorems appear in the literature on generic complexity. Theorem 2. The set of formal languages which are generically computable has measure zero. Theorem 3. There is an infinite hierarchy of generic complexity classes. More precisely, for a proper complexity function ''f'', Gen(f) \subsetneq Gen(f^3).
* The next theorem shows that just as there are average-case complete problems among distributional NP problems, there are also generic-case complete problems. The arguments in the generic case are similar to those in the average case, and the generic-case complete problem is also average-case complete. It is the distributional bounded halting problem. Theorem 4. There is a notion of generic-polynomial-time reduction with respect to which the distributional bounded halting problem is complete within the class of distributional NP problems.


Comparisons with previous work


Almost polynomial time

Meyer and Paterson define an algorithm to be almost polynomial time, or APT, if it halts within ''p(n)'' steps on all but ''p(n)'' inputs of size ''n''. Clearly, APT algorithms are included in our class ''GenP''. We have seen several NP-complete problems in ''GenP'', but Meyer and Paterson show that this is not the case for APT. They prove that an NP-complete problem is reducible to a problem in APT if and only if P = NP. Thus APT seems much more restrictive than ''GenP''.


Average-case complexity

Generic-case complexity is similar to average-case complexity. However, there are some significant differences. Generic-case complexity is a direct measure of the performance of an algorithm on most inputs, while average-case complexity gives a measure of the balance between easy and difficult instances. In addition, generic-case complexity naturally applies to undecidable problems.

Suppose \mathcal{A} is an algorithm whose time complexity T : I \to \mathbb{N} is polynomial on \mu-average. What can we infer about the behavior of \mathcal{A} on typical inputs?

Example 1. Let ''I'' be the set of all words over \{0,1\} and define the size \sigma(w) to be the word length |w|. Define I_n to be the set of words of length ''n'', and assume that each \mu_n is the equiprobable measure. Suppose that T(w) = n for all but one word in each I_n, and T(w) = 2^{2^n} on the exceptional words. In this example ''T'' is certainly polynomial on typical inputs, but ''T'' is not polynomial on average. ''T'' is in ''GenP''.

Example 2. Keep ''I'' and \sigma(w) = |w| as before, but define \mu(w) = 2^{-2|w|} and T(w) = 2^{|w|}. ''T'' is polynomial on average even though it is exponential on typical inputs. ''T'' is not in ''GenP''.

In these two examples the generic complexity is more closely related to behavior on typical inputs than average-case complexity. Average-case complexity measures something else: the balance between the frequency of difficult instances and the degree of difficulty. Roughly speaking, an algorithm which is polynomial time on average can have only a subpolynomial fraction of inputs that require superpolynomial time to compute.

Nevertheless, in some cases generic and average-case complexity are quite close to each other. A function f : I \to \mathbb{R}^+ is polynomial on \mu-average on spheres if there exists k \geq 1 such that \sum_{w \in I_n} f^{1/k}(w)\,\mu_n(w) = O(n), where \{\mu_n\} is the ensemble induced by \mu. If ''f'' is polynomial on \mu-average on spheres, then ''f'' is polynomial on \mu-average, and for many distributions the converse holds.

Theorem 5. If a function f : I \to \mathbb{R}^+ is polynomial on \mu-average on spheres, then ''f'' is generically polynomial relative to the spherical asymptotic density \rho'.
Theorem 6. Suppose a complete algorithm \mathcal{A} has subexponential time bound ''T'' and a partial algorithm \mathcal{B} for the same problem is in ''ExpGenP'' with respect to the ensemble \{\mu_n\} corresponding to a probability measure \mu on the inputs ''I'' for \mathcal{A}. Then there is a complete algorithm which is polynomial on \mu-average.
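The divergence in Examples 1 and 2 can be checked numerically (the helper names are illustrative). For Example 1, the per-sphere expected time explodes while the density of slow inputs vanishes; for Example 2, a Levin-style average sum converges even though every input of length ''n'' costs 2^n:

```python
from fractions import Fraction

# Example 1: T(w) = n except one word per sphere I_n with T = 2^(2^n),
# with mu_n equiprobable on the 2^n words of length n.
def ex1_mean(n):
    """Per-sphere expected running time: astronomically large."""
    return Fraction(n * (2**n - 1) + 2**(2**n), 2**n)

def ex1_slow_density(n):
    """Fraction of length-n inputs that are slower than linear."""
    return Fraction(1, 2**n)

# Example 2: mu(w) = 2^(-2|w|), T(w) = 2^|w|.  With epsilon = 1/2 the
# Levin-style sum over lengths, sum_n 2^n * (2^n)^(1/2) * 2^(-2n) = sum_n 2^(-n/2),
# converges, so T is polynomial on mu-average despite being exponential on every input.
def ex2_levin_partial(N):
    return sum(2.0 ** (-n / 2) for n in range(1, N + 1))

print(ex1_mean(4))            # (4*15 + 2^16) / 16 = 16399/4
print(ex1_slow_density(10))   # 1/1024
print(round(ex2_levin_partial(200), 6))
```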


Errorless heuristic algorithms

In a 2006 paper, Bogdanov and Trevisan came close to defining generic-case complexity (A. Bogdanov and L. Trevisan, ''Average-Case Complexity'', Foundations and Trends in Theoretical Computer Science 2, No. 1 (2006), 111 pp.). Instead of partial algorithms, they consider so-called errorless heuristic algorithms. These are complete algorithms which may fail by halting with output "?". The class ''AvgnegP'' is defined to consist of all errorless heuristic algorithms ''A'' which run in polynomial time and for which the probability of failure on I_n is negligible, i.e., converges superpolynomially fast to 0. ''AvgnegP'' is a subset of ''GenP''. Errorless heuristic algorithms are essentially the same as the algorithms with benign faults defined by Impagliazzo, where polynomial-time-on-average algorithms are characterized in terms of so-called benign algorithm schemes.
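Operationally, an errorless heuristic can be manufactured from any partial algorithm by adding a step budget: run the partial algorithm, and output "?" if it has not answered in time. A minimal sketch, modeling a (possibly non-halting) computation as a Python generator that yields until it produces an answer (the modeling convention and the toy search problem are illustrative):

```python
def make_errorless(computation, budget):
    """Clip a possibly non-halting computation to `budget` steps.
    `computation(x)` is a generator that yields None while working and
    finally yields the answer. The wrapper returns that answer, or '?'
    on timeout, so it always halts and never answers incorrectly."""
    def heuristic(x):
        gen = computation(x)
        for _ in range(budget):
            out = next(gen, "?")
            if out is not None:
                return out
        return "?"
    return heuristic

# Toy computation: search for "00" left to right, one position per step.
def find_00(w):
    for i in range(len(w) - 1):
        if w[i : i + 2] == "00":
            yield True
            return
        yield None
    yield False

h = make_errorless(find_00, budget=8)
print(h("1110011"))    # True  (verdict reached within the budget)
print(h("1" * 100))    # '?'   (budget exhausted before any verdict)
```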

